1 Introduction

In the past 20 years, there has been a significant amount of work done on understanding the 
approximability of various constraint satisfaction problems (CSPs). 

For the purposes of this paper, a CSP is defined by a $k$-ary predicate $P : \{-1,1\}^k \to \{0,1\}$
over a Boolean alphabet.$^1$ An instance consists of a set of constraints, each of which dictates that
$P$ applied to some list of $k$ literals should be satisfied (a literal is a variable or the negation of a
variable). The objective is to find an assignment to the variables so as to maximize the number of
satisfied constraints. Two well-known examples are Max $k$-Sat (where $P$ is the disjunction of the
$k$ input bits) and Max $k$-Lin (where $P$ is the parity of the $k$ input bits).

Essentially every Max CSP is NP-hard (the exception being when $P$ only depends on one of
its input bits). In terms of approximability, it is easy to see that choosing a uniformly random
assignment to the variables, without even looking at the instance, yields an approximation ratio of
$|P^{-1}(1)|/2^k$, where $|P^{-1}(1)|$ is the number of inputs in $\{-1,1\}^k$ that satisfy $P$.
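As a quick sanity check of the ratio $|P^{-1}(1)|/2^k$, the following sketch counts accepting inputs by brute force. The names `random_assignment_ratio`, `sat3` and `lin3` are illustrative and not taken from the paper; the sketch assumes the convention that $-1$ encodes logical True and that Max 3-Lin accepts the inputs of odd parity.

```python
from itertools import product

def random_assignment_ratio(predicate, k):
    """Fraction of inputs in {-1,1}^k accepted by `predicate`, i.e. |P^{-1}(1)| / 2^k."""
    accepted = sum(1 for x in product((-1, 1), repeat=k) if predicate(x))
    return accepted / 2 ** k

def sat3(x):
    # Disjunction of three bits; -1 is interpreted as logical True (assumed convention).
    return any(b == -1 for b in x)

def lin3(x):
    # Odd parity: the product of the three bits equals -1.
    return x[0] * x[1] * x[2] == -1

print(random_assignment_ratio(sat3, 3))  # 0.875
print(random_assignment_ratio(lin3, 3))  # 0.5
```

The two printed values are the thresholds 7/8 and 1/2 that appear in the hardness results discussed next.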

Improving upon this trivial algorithm turns out to be surprisingly difficult. In a groundbreaking 
paper, Goemans and Williamson [GW95] used semidefinite programming (SDP) to give improved 
approximation algorithms for Max 2-Sat and Max 2-Lin. SDP was soon used to give better 
approximation algorithms for many other problems as well, but for some CSPs, perhaps most 
prominently Max 3-Sat and Max 3-Lin, no improvement over the random assignment algorithm 
was found. Then, in a new breakthrough, Håstad [Has01] showed that such an improvement would
not be possible: approximating Max 3-Sat within $7/8 + \epsilon$ or Max 3-Lin within $1/2 + \epsilon$ for any constant
$\epsilon > 0$ is NP-hard. In other words, Max 3-Sat and Max 3-Lin have the remarkable property that
the completely mindless random assignment algorithm is optimal! 

CSPs which have this property - that they are NP-hard to approximate within $|P^{-1}(1)|/2^k + \epsilon$
- are called approximation resistant. Following Håstad's initial result, many more CSPs have been
shown to be approximation resistant [GLST98, ST00, EH08, Has05]. Fairly quickly, a complete
characterization of approximation resistance for predicates of arity three was found:
$P : \{-1,1\}^3 \to \{0,1\}$ is approximation resistant if and only if $P$ accepts all inputs of odd parity, or if it accepts all
inputs of even parity [Has01, Zwi98].

However, the next small case, predicates of arity 4, is still not completely classified, and it is 
not at all clear whether there is a nice, clean characterization. We would like to emphasize that 
by a characterization we mean a necessary and sufficient condition. Modulo symmetries, there are 
400 non-constant predicates of arity 4. Hast [Has05] showed 275 of these to be approximable, 79
of them to be approximation resistant, and left the status of the remaining 46 open. 

In recent years, progress has been made on our understanding of approximation resistance under 
the assumption of the Unique Games Conjecture (UGC) [Kho02]. The first author and Mossel 
[AM09] proved that assuming the UGC, $P$ is approximation resistant if there exists an unbiased
pairwise independent distribution over $\{-1,1\}^k$ supported on $P^{-1}(1)$. Using this condition, it
can be shown that as $k \to \infty$, an overwhelming fraction of all predicates are in fact approximation
resistant [AH11]. A somewhat more complicated (and more general) sufficient condition is known [AH12].
As in [AM09], this condition is in terms of the biases and pairwise correlations of distributions
supported on $P$. At this point, it seems unlikely that there is a clean characterization (necessary
and sufficient), but one can hope that approximation resistance is at least decidable. 

Relevant here is the work of Raghavendra [Rag08], which shows, assuming the UGC, that for any
CSP, its approximability threshold is determined by the integrality gap of a natural SDP relaxation
for the problem. Furthermore, Raghavendra and Steurer [RS09] show that this integrality gap can
be approximated to within an additive error $\epsilon$ (in time doubly exponential in $1/\epsilon$).

1 As is common, the input bits are written in the $\{-1,1\}$ notation with $-1$ interpreted as logical True and $1$ as
logical False. Also "parity" corresponds to taking the product of the bits: odd parity means the product is $-1$ and even
parity means the product is $1$.

This "almost" shows that it is decidable to determine whether a CSP is approximation resistant. 
However, as we have no a priori bound on the error $\epsilon$ needed, it only shows that it is recursively
enumerable to determine whether a CSP is approximable. Note that, for every $k$ there is a smallest
gap $\epsilon_k$ such that any approximable predicate $P$ on $k$ bits can be approximated within at least
$|P^{-1}(1)|/2^k + \epsilon_k$. If this number $\epsilon_k$ could be computed, approximation resistance would be decidable,
but it is possible (though seemingly unlikely) that $\epsilon_k$ tends to $0$ faster than any computable function.

1.1 Our Contribution 



The strength of [Rag08], namely that it works in a black-box fashion for any CSP, is in some sense
a weakness in this setting, as it is not explicit and does not give any insight into what structural
properties cause a predicate to be approximation resistant. In this paper, we make progress towards
an explicit characterization of approximation resistance. We restrict the class of CSPs we study in
two ways.

1. We only consider $k$-partite instances. In a $k$-partite instance, the variables are grouped into
$k$ layers, and in each constraint, the literal passed as the $i$'th argument to $P$ comes from the
$i$'th layer.

2. We only consider $P$ which are even. $P$ is even if $P(x) = P(-x)$ for every $x \in \{-1,1\}^k$, where
$-x$ denotes bitwise negation of $x$.

We refer to this as the Max PartCSP($P$) problem. Our main contribution is an explicit
necessary and sufficient characterization (assuming the UGC) of when Max PartCSP($P$) is
approximation resistant. As in the case of [AM09] and its generalizations, our condition is based on
the existence of certain distributions $\mu$ over the set of satisfying assignments of $P$, and furthermore
the conditions on these distributions depend only on their pairwise correlations $\mathbb{E}_\mu[x_i x_j]$.

In order to properly state the characterization, we need to make a few definitions. 

Definition 1.1. Let $G = (S, E)$ be a multigraph with vertex set $S \subseteq [k]$ and no self-loops. For a
correlation matrix $\rho \in \mathbb{R}^{k \times k}$ we define $\rho(G) = \prod_{ij \in E} \rho_{ij}$. For a distribution $\Lambda$ over $k \times k$ correlation
matrices we define $\Lambda(G) = \mathbb{E}_{\rho \sim \Lambda}[\rho(G)]$.

The key part of our definition is the existence of distributions $\Lambda$ over correlation matrices, each
of which arises from a distribution over $P^{-1}(1)$ (we refer to these as $P$-supported correlation
matrices), such that $\Lambda(G)$ vanishes on certain graphs. Specifically:

Definition 1.2. Let $\Lambda$ be a distribution over $k \times k$ correlation matrices. We say that $\Lambda$ is
$m$-vanishing on $P$ if:

1. $\Lambda$ is a distribution over $P$-supported correlation matrices.

2. For every $S \subseteq [k]$ such that $\hat{P}(S) \neq 0$, and every odd-degree multigraph $G$ on $S$ with at most
$m$ edges, it holds that $\Lambda(G) = 0$.

Here $\hat{P}(S)$ denotes the Fourier coefficient of the predicate $P$ on the set $S$ (i.e. the coefficient
of the monomial $\prod_{i \in S} x_i$ when $P$ is written as a multilinear polynomial). Now we can state our
main result.



Theorem 1.3. Assuming the UGC, Max PartCSP($P$) is approximation resistant if and only if
for every positive integer $m$ there exists a distribution $\Lambda$ which is $m$-vanishing on $P$.

Note that if there is a pairwise independent distribution supported on $P$, i.e., if the identity
matrix is $P$-supported, then the singleton distribution $\Lambda$ on the identity matrix is
$m$-vanishing on $P$ for every $m$. As such, this characterization generalizes the sufficient condition of
[AM09].
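To make Definitions 1.1 and 1.2 concrete, here is a small brute-force sketch. It evaluates $\rho(G)$ for a multigraph given as an edge list, enumerates odd-degree multigraphs with at most $m$ edges, and confirms that the identity matrix vanishes on all of them. The function names are hypothetical, and for simplicity the check runs over every vertex set $S$ rather than only those with $\hat{P}(S) \neq 0$, which is a stronger requirement than the definition asks for.

```python
from collections import Counter
from itertools import combinations, combinations_with_replacement

def rho_of_graph(rho, edges):
    """rho(G): product of rho[i][j] over the edges (i, j) of the multigraph G."""
    value = 1.0
    for i, j in edges:
        value *= rho[i][j]
    return value

def odd_degree_multigraphs(vertices, max_edges):
    """Yield edge lists of loop-free multigraphs with at most `max_edges` edges
    in which every vertex of `vertices` has odd degree (counted with multiplicity)."""
    pairs = list(combinations(sorted(vertices), 2))
    for m in range(1, max_edges + 1):
        for edges in combinations_with_replacement(pairs, m):
            degree = Counter(v for e in edges for v in e)
            if all(degree[v] % 2 == 1 for v in vertices):
                yield list(edges)

def identity_vanishes(k, m):
    """Check that rho = identity gives rho(G) = 0 on every odd-degree multigraph."""
    identity = [[1.0 if i == j else 0.0 for j in range(k)] for i in range(k)]
    for size in range(1, k + 1):
        for S in combinations(range(k), size):
            for G in odd_degree_multigraphs(S, m):
                if rho_of_graph(identity, G) != 0.0:
                    return False
    return True

print(identity_vanishes(4, 4))  # True
```

Every odd-degree multigraph has at least one edge, and every edge picks up an off-diagonal zero of the identity matrix, which is why the check succeeds for any $k$ and $m$.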

Given $m$ and $P$, it is fairly easy to prove that the existence of an $m$-vanishing $\Lambda$ on $P$ is decidable.
Hence the failure of the condition of Theorem 1.3 is recursively enumerable (one can search for an $m$
admitting no $m$-vanishing distribution). We feel that this characterization
is promising with respect to decidability. For instance, it is quite possible that one can show some
explicit upper bound on the largest value of $m$ that one needs to check, which would immediately
give decidability. We also remark that, even though we do not prove it here, the characterization
in Theorem 1.3 is equivalent to saying that there is a distribution $\Lambda$ which is $m$-vanishing on $P$ for
all $m$ simultaneously.

1.2 Proof Ideas 

We now briefly and informally outline the main ideas of the proof of Theorem 1.3.

Algorithm. Suppose there is no $m$-vanishing distribution $\Lambda$ for some $m$. By LP duality, there
are then constants $\{\gamma_G\}$ and $\delta > 0$ such that $\sum_G \gamma_G \rho(G) \ge \delta$ for all $P$-supported $\rho$, where the sum is over
all odd-degree $G$ on at most $m$ edges. Now, assume we are given a solution to the basic SDP
relaxation for Max PartCSP($P$) with value 1 (in reality it will only have value close to 1 but this
is just a small technicality). Then for each constraint we have a local distribution $\mu$ and since the
SDP value is 1 its correlation matrix $\rho$ is $P$-supported. The basic idea is, very loosely, to design a
rounding algorithm which, given some graph $G$, finds an assignment with value $|P^{-1}(1)|/2^k + \rho(G)$.
Picking a random $G$ with probability proportional to $\gamma_G$ then gives an assignment with value
$|P^{-1}(1)|/2^k + \Omega(\delta)$.

To get an assignment with value $|P^{-1}(1)|/2^k + \rho(G)$, the idea is to do as follows. For simplicity,
suppose $V(G) = [k]$ and consider the monomial $\prod_{i=1}^k x_i$. We can construct the solution iteratively
edge by edge, as follows. Initially, set all $x_i = 1$ (corresponding to the empty graph). Then, for
an edge $e = (i,j)$ pick a standard Gaussian vector $g_e$, and multiply $x_i$ (resp. $x_j$) by $\langle g_e, v_i \rangle$
(resp. $\langle g_e, v_j \rangle$), where $v_i$ and $v_j$ are the vectors in the SDP solution corresponding to $x_i$ and $x_j$.
This operation has the effect of multiplying $\mathbb{E}[\prod_{i \in [k]} x_i]$ by a factor $\langle v_i, v_j \rangle = \rho_{ij}$, where $\rho_{ij}$ is the
correlation between $i$ and $j$ in the local distribution on $x_1, \ldots, x_k$. Repeating this for all edges of
the graph, we get $\mathbb{E}[\prod_{i \in [k]} x_i] = \rho(G)$, and we can make sure that all other non-constant monomials
have expectation 0, meaning that we get an advantage of $\rho(G)$ over $|P^{-1}(1)|/2^k$, i.e., value $|P^{-1}(1)|/2^k + \rho(G)$.
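The effect of a single edge can be checked directly. The short derivation below is a sketch that assumes $g$ is a standard Gaussian vector with independent coordinates and that $u$, $v$ are the unit vectors the SDP assigns to the two endpoints of the edge.

```latex
\begin{align*}
\mathbb{E}\big[\langle g, u\rangle \, \langle g, v\rangle\big]
  &= \sum_{a,b} u_a v_b \, \mathbb{E}[g_a g_b]
   = \sum_{a} u_a v_a
   = \langle u, v\rangle .
\end{align*}
```

Since distinct edges use independent Gaussian vectors, the factors contributed by different edges multiply, which yields the product of the correlations over all edges of $G$.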

To wrap this up and get the formal proof, there are some additional technicalities to account for:
the values assigned by the above rounding are not Boolean-valued, we need to deal with negated
literals, and we need to take the magnitude of the Fourier coefficients $\hat{P}(S)$ of $P$ into consideration.
The formalization of the "monomial rounding" described above is given in Lemma 4.1 in Section 4.1
and its use to give a non-trivial algorithm for Max PartCSP($P$) is then described in Section 4.3.

Hardness. As is by now standard, the task of proving hardness boils down to constructing a
dictatorship test using the predicate $P$. The dictatorship test gets oracle access to $k$ functions
$f_1, \ldots, f_k : \{-1,1\}^n \to \{-1,1\}$, and the question is whether $f_1 = f_2 = \cdots = f_k$ are all equal to
some dictatorship function. The test operates by picking $k$ inputs $x_1, \ldots, x_k$ and then accepts iff
$P(f_1(x_1), f_2(x_2), \ldots, f_k(x_k)) = 1$. The restriction to only using $P$ as the acceptance predicate is
what gives us hardness for Max CSP($P$) rather than an arbitrary CSP, and the restriction that
we have $k$ different functions and make one query to each, instead of a single function, is precisely
what gives us hardness for Max PartCSP($P$) instead of Max CSP($P$).

Such a test is completely specified by the distribution of $(x_1, \ldots, x_k)$. To specify this we
choose some very large $m$ and use the $m$-vanishing distribution $\Lambda$ guaranteed to exist. To sample
$(x_1, \ldots, x_k)$, we do as follows: first sample a $P$-supported correlation matrix $\rho$ according to $\Lambda$, and
let $\mu$ be some $P$-supported distribution with correlation matrix $\rho$. Then, for each $i \in [n]$ we sample
the $i$'th coordinate $(x_1^i, \ldots, x_k^i)$ independently from $\mu$. The completeness of the test follows by $\mu$
being $P$-supported. The soundness follows using the invariance principle: first, we show that if the
functions $f_1, \ldots, f_k$ have low influence the acceptance probability (appropriately arithmetized) can
be well approximated by a multilinear polynomial in Gaussian variables with the same second moments
as $x$. Since higher moments of Gaussian variables are determined by their covariance matrix,
this multilinear polynomial (and therefore also the acceptance probability) can be expressed as a
function of the covariance matrix, i.e., $\rho$, and it turns out that all terms except for the constant
$|P^{-1}(1)|/2^k$ are of the form $\rho(G)$ for some odd-degree graph $G$ on fewer than $m$ edges. Hence taking
the expectation over $\rho \sim \Lambda$, all non-constant terms vanish.

Source of the various restrictions. It may be instructive to point out where the various restrictions 
we impose come into play. 

$k$-partiteness. The fact that we know for each variable what "role" it will play is critical in allowing
us to obtain the algorithm. In particular, in the "monomial rounding" described above, it
is important that any given variable corresponds to some given vertex of the graph $G$ that
we are using (the vertices of $G$ correspond to layers of the CSP instance). If a variable could
appear as several different vertices of $G$ (i.e., in several different layers), it is not clear how
to round it in such a way that the different occurrences don't interfere with each other.

Even predicates. This restriction is in some sense minor and more technical in nature. It allows us
to assume that the distributions $\mu$ supported on $P^{-1}(1)$ are unbiased, which simplifies many
arguments. That said, it is not clear exactly how to generalize the present characterization
to a general $P$.

Odd-degree graphs. The reason why the characterization only involves odd-degree graphs is
essentially the presence of negated literals. First, in the algorithm it turns out that it is necessary
for the graphs to be odd-degree, as this essentially ensures that we don't have cancellations
when dealing with negated literals. Second, in the hardness result it turns out that it is
sufficient for the graphs to have odd degree, because the functions $f_i$ we are testing can be
assumed to be odd by the standard technique of folding, which is implemented by introducing
negated literals.



1.3 Discussion 

On the unnecessity of pairwise independence. It is known that there are approximation resistant 
predicates which do not support a pairwise independent distribution. A basic such example is the 
predicate GLST $: \{-1,1\}^4 \to \{0,1\}$ defined by

$$\mathrm{GLST}(x_1, x_2, x_3, x_4) = \begin{cases} x_2 \neq x_3 & \text{if } x_1 = -1 \\ x_2 \neq x_4 & \text{if } x_1 = 1 \end{cases}$$

This predicate was shown to be approximation resistant by Guruswami et al. [GLST98], but there
is no pairwise independent distribution supported on its accepting assignments - indeed it is not
difficult to check that $x_2 x_3 + x_2 x_4 + x_3 x_4 < 0$ for all accepting inputs. In [AH12], Theorem VIII.6,
a generalization of the pairwise independence condition was given which also covers the GLST
predicate and in fact, as far as we are aware, covers all currently known examples of approximation
resistant predicates.
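The claim about GLST is easy to confirm by enumeration. The sketch below (with an illustrative function name `glst`) lists the accepting inputs and evaluates $x_2 x_3 + x_2 x_4 + x_3 x_4$ on each of them.

```python
from itertools import product

def glst(x):
    """GLST(x1, x2, x3, x4): x2 != x3 if x1 = -1, and x2 != x4 if x1 = 1."""
    x1, x2, x3, x4 = x
    return int(x2 != x3) if x1 == -1 else int(x2 != x4)

accepting = [x for x in product((-1, 1), repeat=4) if glst(x)]
pair_sums = {x[1] * x[2] + x[1] * x[3] + x[2] * x[3] for x in accepting}
print(len(accepting), pair_sums)  # 8 {-1}
```

Under a pairwise independent distribution each of the three products would have expectation 0, so the sum would have expectation 0 as well; since the sum equals $-1$ on every accepting input, no such distribution can be supported on the accepting assignments.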

The condition of Theorem 1.3 essentially generalizes the condition of [AH12]. We say "essentially"
because Theorem VIII.6 of [AH12] in some cases allows for a condition referred to as
$\{i,j\}$-negativity, and it is not clear that this condition is captured by Theorem 1.3. The only example
given in [AH12] using the $\{i,j\}$-negativity condition is not an even predicate, so it is possible
that this is a distinction between even $P$ and general $P$. On the other hand, it appears that for the
example given in [AH12], one can prove approximation resistance without using $\{i,j\}$-negativity,
so it is not completely clear whether allowing this adds any new predicates. Another possibility
is that this is a distinction between Max CSP($P$) and Max PartCSP($P$), because the proof in
[AH12] that $\{i,j\}$-negativity suffices does not extend to partite instances. In short, the situation
is a bit of a mystery and may warrant further study.



On Max PartCSP($P$) vis-a-vis Max CSP($P$). It is not known whether Max PartCSP($P$)
behaves differently from Max CSP($P$) with respect to approximation resistance. Almost all proofs of
approximation resistance for Max CSP($P$), including NP-hardness results such as [Has01, EH08],
can be adjusted to produce $k$-partite instances, thereby showing approximation resistance for
Max PartCSP($P$).

However, one exception is the result of Raghavendra [Rag08], where it is not at all clear how
to achieve this. If it were the case that the reduction of [Rag08] can be adjusted to produce
partite instances, our restriction to $k$-partite instances would have been without loss of generality
(assuming the UGC), but as matters stand, this cannot be deduced.

Another exception is the hardness derived in [AH12] from the $\{i,j\}$-negativity condition
mentioned above.



1.4 Outline 

In Section 2 we introduce notation and terminology used throughout the paper and state some
known theorems that we need. In Section 3 we describe how to decide whether an $m$-vanishing
distribution exists. We then proceed to prove Theorem 1.3, giving an algorithm in Section 4 and
proving hardness in Section 5.



2 Notation and definitions 



As is common, for convenience of notation we use $\{-1,1\}$ for Boolean values rather than $\{0,1\}$.
Throughout, $P$ denotes a $k$-ary predicate $P : \{-1,1\}^k \to \{0,1\}$ which we assume to be even, i.e.,
$P(x) = P(-x)$ for all $x$.

We say a distribution $\mu$ over $\{-1,1\}^k$ is $P$-supported if $\mathrm{supp}(\mu) \subseteq P^{-1}(1)$. Similarly a correlation
matrix $\rho \in \mathbb{R}^{k \times k}$ is $P$-supported if there is a $P$-supported $\mu$ such that $\rho_{ij} = \mathbb{E}_\mu[x_i x_j]$ for all $i, j$.
Note that since $P$ is even, any $P$-supported distribution can without loss of generality be assumed
to be unbiased, i.e., satisfying $\mathbb{E}_\mu[x_i] = 0$ for all $i$, as far as its correlation matrix is concerned (since
we can spread the probability mass equally on any pair of assignments $x$ and $-x$ without affecting
the correlation matrix).
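The symmetrization step in the parenthetical remark can be sketched in a few lines. The helper names below are hypothetical, and the example distribution is arbitrary (it is not tied to any particular predicate); the point is only that averaging the mass of $x$ and $-x$ removes the bias while leaving every pairwise correlation unchanged.

```python
def correlation_matrix(dist, k):
    """dist maps tuples in {-1,1}^k to probabilities; returns the matrix of E[x_i x_j]."""
    return [[sum(p * x[i] * x[j] for x, p in dist.items()) for j in range(k)]
            for i in range(k)]

def bias(dist, k):
    """Vector of first moments E[x_i]."""
    return [sum(p * x[i] for x, p in dist.items()) for i in range(k)]

def symmetrize(dist):
    """Move half of the mass of every x to -x."""
    out = {}
    for x, p in dist.items():
        neg = tuple(-b for b in x)
        out[x] = out.get(x, 0.0) + p / 2
        out[neg] = out.get(neg, 0.0) + p / 2
    return out

mu = {(1, 1, -1): 0.5, (1, -1, 1): 0.25, (-1, -1, -1): 0.25}
sym = symmetrize(mu)
print(all(abs(b) < 1e-12 for b in bias(sym, 3)))                # True
print(correlation_matrix(sym, 3) == correlation_matrix(mu, 3))  # True
```

For an even predicate, the support of the symmetrized distribution stays inside $P^{-1}(1)$, because $-x$ is accepted whenever $x$ is.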

For the purposes of this paper, a multigraph is a graph $G$ which may have multiple edges but no
self-loops. A multigraph has odd degree if every vertex of the graph has odd degree (when edges are
counted with multiplicities). A key role in our characterization is played by multigraphs $G$ whose
vertex set is some subset $S \subseteq [k]$; we refer to such a $G$ as a multigraph on $S$.

We write $S^n$ for the $n$-dimensional unit sphere (i.e., the set of unit vectors in $\mathbb{R}^{n+1}$), and for two
vectors $u, v \in \mathbb{R}^n$ we write $\langle u, v \rangle$ for their standard inner product.

2.1 Partite Max-CSP and its SDP relaxation 

An instance $\Psi$ of Max PartCSP($P$) has $k \cdot n$ Boolean variables indexed by $[k] \times [n]$. Each
constraint is of the form $P(b_1 x_{1,j_1}, b_2 x_{2,j_2}, \ldots, b_k x_{k,j_k})$ for some indices $j_1, \ldots, j_k$ and some signs
$b_1, \ldots, b_k \in \{-1,1\}$.

We use the following notation. The constraints of an instance are $(T_1, P_1), (T_2, P_2), \ldots$, where
$T_i \subseteq [k] \times [n]$ is the set of variables that the $i$'th constraint depends on (exactly one from each
layer) and $P_i : \{-1,1\}^{T_i} \to \{0,1\}$ is $P$ applied to the variables of $T_i$, possibly with some variables
negated.

We say that $\Psi$ is $\alpha$-satisfiable if there is an assignment to the variables which satisfies an $\alpha$
fraction of all the constraints.

The basic SDP relaxation is described in Figure 1. It has as variables a vector $v_{i,j} \in S^{n \cdot k}$
for every variable $x_{i,j}$, and an unbiased distribution $\mu_i$ over $\{-1,1\}^{T_i}$ for each constraint $(T_i, P_i)$.
The fact that this is a relaxation follows from the following observation: for any global integral
assignment $\sigma \in \{-1,1\}^{k \cdot n}$, let $\mathcal{D}$ be the uniform distribution over the pair of integral assignments $\sigma$
and $-\sigma$. Let $\mu_i$ be the restriction of $\mathcal{D}$ to the respective set $T_i$, and let $v_{i,j} = \sigma_{i,j}$ be a 1-dimensional
vector. Then it is easy to see that this is a feasible solution to the SDP and its objective is the same
as the fraction of constraints satisfied by $\sigma$ (or $-\sigma$). Here we use the evenness of the predicate.



$$\begin{array}{lll}
\text{Maximize} & \sum_i \mathbb{E}_{x \sim \mu_i}[P_i(x)] & \\
\text{Subject to} & \mu_i \text{ is an unbiased distribution over } \{-1,1\}^{T_i} & \text{for every } i \\
 & v_{i,j} \in S^{n \cdot k} & \text{for all } (i,j) \in [k] \times [n] \\
 & \mu_i|_T = \mu_j|_T & \text{where } T = T_i \cap T_j \\
 & \langle v_{i_1,j_1}, v_{i_2,j_2} \rangle = \mathbb{E}_{x \sim \mu_i}[x_{i_1,j_1} x_{i_2,j_2}] & \text{for all } (i_1,j_1), (i_2,j_2) \in T_i
\end{array}$$

Fig. 1: SDP relaxation of Max PartCSP($P$).

2.2 The Unique Games Conjecture

In this section, we state the formulation of the Unique Games Conjecture that we will use.

Definition 2.1. An instance $\Lambda = (U, V, E, \Pi, [L])$ of Unique Games consists of an unweighted
bipartite multigraph $G = (U \cup V, E)$, a set $\Pi$ of constraints, and a set $[L]$ of labels. For each edge
$e \in E$ there is a constraint $\pi_e \in \Pi$, which is a permutation on $[L]$. The goal is to find a labeling
$\ell : U \cup V \to [L]$ of the vertices such that as many edges as possible are satisfied, where an edge
$e = (u, v)$ is said to be satisfied by $\ell$ if $\ell(v) = \pi_e(\ell(u))$.
Definition 2.2. Given a Unique Games instance $\Lambda = (U, V, E, \Pi, [L])$, let $\mathrm{Opt}(\Lambda)$ denote the
maximum fraction of simultaneously satisfied edges of $\Lambda$ by any labeling, i.e.,

$$\mathrm{Opt}(\Lambda) := \frac{1}{|E|} \max_{\ell : U \cup V \to [L]} |\{e : \ell \text{ satisfies } e\}|.$$

Conjecture 2.3 ([Kho02]). For every $\gamma > 0$, there is an integer $L$ such that, for Unique Games
instances $\Lambda$ with label set $[L]$, it is NP-hard to distinguish between

• $\mathrm{Opt}(\Lambda) \ge 1 - \gamma$

• $\mathrm{Opt}(\Lambda) \le \gamma$.



2.3 Analytic Tools 

Any Boolean function $f : \{-1,1\}^n \to \mathbb{R}$ can be written uniquely as a multilinear polynomial

$$f(x) = \sum_{T \subseteq [n]} \hat{f}(T) \chi_T(x),$$

where $\hat{f}(T)$ are the Fourier coefficients of $f$ and $\chi_T(x) = \prod_{i \in T} x_i$. As such, $f$ can be viewed as a
multilinear polynomial $f : \mathbb{R}^n \to \mathbb{R}$ and this is the view we commonly take. We write $f^{\le d}$ for the
part of $f$ that is of degree $\le d$, i.e., $f^{\le d}(x) = \sum_{|S| \le d} \hat{f}(S) \chi_S(x)$.

Fact 2.4.

$$\mathbb{E}[f(X)] = \hat{f}(\emptyset), \qquad \mathrm{Var}[f(X)] = \sum_{T \neq \emptyset} \hat{f}(T)^2,$$

where the expectations are over a uniform $X$ in $\{-1,1\}^n$.

Definition 2.5. The influence of the $i$'th variable on $f$ is

$$\mathrm{Inf}_i(f) = \sum_{T \ni i} \hat{f}(T)^2$$

and the low-degree influence is

$$\mathrm{Inf}_i^{\le d}(f) = \mathrm{Inf}_i(f^{\le d}) = \sum_{T \ni i,\ |T| \le d} \hat{f}(T)^2.$$
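For small $n$ these quantities can be computed by brute force. The sketch below is illustrative only (the function names are not from the paper): it computes all Fourier coefficients via $\hat{f}(T) = \mathbb{E}[f(x)\chi_T(x)]$ and evaluates influences for the majority function on three bits.

```python
from itertools import combinations, product
from math import prod

def fourier_coefficients(f, n):
    """Return {T: f_hat(T)} for f on {-1,1}^n, using f_hat(T) = E[f(x) * chi_T(x)]."""
    points = list(product((-1, 1), repeat=n))
    coeffs = {}
    for size in range(n + 1):
        for T in combinations(range(n), size):
            coeffs[T] = sum(f(x) * prod(x[i] for i in T) for x in points) / len(points)
    return coeffs

def influence(coeffs, i, max_degree=None):
    """Inf_i(f), or the low-degree influence Inf_i^{<=d}(f) when max_degree=d is given."""
    return sum(c * c for T, c in coeffs.items()
               if i in T and (max_degree is None or len(T) <= max_degree))

def maj3(x):
    return 1 if sum(x) > 0 else -1

coeffs = fourier_coefficients(maj3, 3)
print(coeffs[(0,)], coeffs[(0, 1, 2)])           # 0.5 -0.5
print(influence(coeffs, 0))                       # 0.5
print(influence(coeffs, 0, max_degree=1))         # 0.25
```

In line with Fact 2.4, the coefficient on the empty set is the mean (0 for majority) and the squared non-constant coefficients sum to the variance (1 here).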



As is common, the main analytic tool in our hardness result is the invariance principle [MOO10,
Mos10]. In particular, we have the following theorem.

Theorem 2.6. For every $k$ and $\epsilon > 0$ there is a $\delta > 0$ such that the following holds.

Let $\mu$ be an unbiased distribution over $\{-1,1\}^k$ with $\min_{x \in \{-1,1\}^k} \mu(x) \ge \epsilon$, $X$ be a random
$k \times n$ matrix over $\{-1,1\}$ with each column distributed according to $\mu$, independently, and $G$ be a
random $k \times n$ matrix of standard Gaussians with the same covariance structure as $X$.

Then for any $k$ multilinear polynomials $f_1, \ldots, f_k : \mathbb{R}^n \to \mathbb{R}$ with $\mathrm{Inf}_j^{\le 1/\delta}(f_i) \le \delta$ and $\mathrm{Var}[f_i] \le 1$
for all $i \in [k]$, $j \in [n]$, we have

$$\left| \mathbb{E}_X\left[\prod_{i=1}^k f_i(X_i)\right] - \mathbb{E}_G\left[\prod_{i=1}^k f_i(G_i)\right] \right| \le \epsilon.$$
Theorem 2.6 can be derived using Theorem 4.2 and Lemma 6.2 of [Mos10]: using Lemma 6.2 it
follows that $\mathbb{E}_X\left[\prod_{i=1}^k f_i(X_i)\right]$ is close to $\mathbb{E}_X\left[\prod_{i=1}^k f_i^{\le 1/\delta}(X_i)\right]$ for sufficiently small $\delta$. Then we can
use Theorem 4.2 on the functions $\{f_i^{\le 1/\delta}\}$. In Theorem 4.2, the values of $f_i^{\le 1/\delta}(X_i)$ and $f_i^{\le 1/\delta}(G_i)$
are truncated to the range $[0,1]$. By scaling this holds with $[0,1]$ replaced by some interval $[-B, B]$.
It is not too hard to show that for sufficiently large $B$ (as a function of $k$ and $\epsilon$) truncation to the
interval $[-B, B]$ does not change $\mathbb{E}\left[\prod_{i=1}^k f_i^{\le 1/\delta}(X_i)\right]$ (resp. $\mathbb{E}\left[\prod_{i=1}^k f_i^{\le 1/\delta}(G_i)\right]$) by more than $\epsilon$.



2.4 Products of Gaussians 



We need the following lemma about the expectation of a product of Gaussians in terms of their
pairwise correlations.

Lemma 2.7. Let $r$ be an integer and let $g_1, \ldots, g_r$ be jointly Gaussian with mean 0, variance 1, and
covariance matrix $\rho$. Then

$$\mathbb{E}\left[\prod_{i=1}^r g_i\right] = \sum_M \prod_{ij \in M} \rho_{ij},$$

where $M$ ranges over all perfect matchings of the complete graph on $r$ vertices (if $r$ is even there
are $(r-1)!!$ terms and if $r$ is odd the expectation is 0).
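The right-hand side of Lemma 2.7 is straightforward to evaluate for small $r$. The following sketch, with hypothetical helper names, enumerates perfect matchings recursively and sums the corresponding products of correlations.

```python
def perfect_matchings(items):
    """Yield every perfect matching of `items` as a list of pairs."""
    items = list(items)
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for idx, partner in enumerate(rest):
        remaining = rest[:idx] + rest[idx + 1:]
        for matching in perfect_matchings(remaining):
            yield [(first, partner)] + matching

def gaussian_product_moment(rho):
    """Sum over perfect matchings M of prod_{ij in M} rho[i][j] (the formula of Lemma 2.7)."""
    total = 0.0
    for matching in perfect_matchings(range(len(rho))):
        term = 1.0
        for i, j in matching:
            term *= rho[i][j]
        total += term
    return total

ones = [[1.0] * 4 for _ in range(4)]
print(len(list(perfect_matchings(range(6)))))  # 15
print(gaussian_product_moment(ones))           # 3.0
```

The all-ones matrix corresponds to $g_1 = \cdots = g_4$, and the value 3 is the fourth moment of a standard Gaussian; for odd $r$ there are no perfect matchings, so the function returns 0.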



3 Decidability of m-vanishing Distributions 

Proposition 3.1. Given $P$ and $m$, the existence of a $\Lambda$ which is $m$-vanishing on $P$ is decidable.

Proof. There are $M \le 2^k \cdot k^{2m}$ graphs $G_1, \ldots, G_M$ of interest, namely graphs with at most $m$ edges
supported on some vertex set $S \subseteq [k]$. We need to decide whether there is a distribution $\Lambda$ on
$P$-supported correlation matrices such that

$$\mathbb{E}_{\rho \sim \Lambda}\left[(\rho(G_1), \ldots, \rho(G_M))\right] = 0.$$

By Carathéodory's theorem, $\Lambda$ can be assumed to have support of size at most $M + 1$. Each
$\rho$ in the support of $\Lambda$ can be represented using $|P^{-1}(1)| + k^2 + 1$ real variables, representing a
distribution $\mu$ over $P^{-1}(1)$, its correlation matrix $\rho$, and finally its probability under $\Lambda$. The
constraints that each $\rho$ is the correlation matrix of the corresponding $\mu$ and that $\Lambda(G_i) = 0$ for
every $G_i$ can be written as a finite number of polynomial equations in the $(M+1)(|P^{-1}(1)| + k^2 + 1)$
variables.$^2$ In other words the set of $m$-vanishing $\Lambda$ of support size $\le M + 1$ forms an algebraic set,
so determining whether such a $\Lambda$ exists boils down to determining whether this algebraic set is
non-empty, which is decidable [Tar51]. □



4 Algorithm 

In this section we give an approximation algorithm with approximation ratio strictly larger than
$|P^{-1}(1)|/2^k$ for predicates $P$ which do not satisfy the condition of Theorem 1.3. Thus, there exists
an $m$ such that for every distribution $\Lambda$ (over $P$-supported correlation matrices), there is an
odd-degree multigraph $G$ over $S \subseteq [k]$ with at most $m$ edges such that $\hat{P}(S) \neq 0$ and $\Lambda(G) \neq 0$. For
the rest of this section, fix this value of $m$.



2 The variables representing probabilities need to be non-negative. This can be effected by taking them to be 
squares of respective variables. 
4.1 Rounding Monomials

First, we give an algorithm which will allow us to "pick up" a contribution proportional to $\rho(G)$
for any monomial, where $\rho$ is the correlation matrix of the local distribution (given by the SDP)
on that monomial and $G$ is any graph. Lemma 4.1 below formalizes the high-level idea given in
Section 1.2. Recall that the variables of the CSP are partitioned into $k$ layers, there are $n$ variables
in each layer, and the SDP relaxation is as in Figure 1.

Lemma 4.1. Let $S \subseteq [k]$ be a set of layers, $G$ be an odd-degree multigraph on $S$, and $\tau > 0$.
Then for all sufficiently large $B \ge \mathrm{poly}(\log 1/\tau)$ (where the polynomial depends only on $G$) there is
a polynomial time algorithm which, given an SDP solution as in Figure 1, outputs an assignment
$\alpha : S \times [n] \to [-1,1]$ to the layers of $S$ such that the following holds.

Let $V : S \to [n]$ be any choice of variables, one from each layer in $S$. Then

$$\mathbb{E}\left[\prod_{i \in S} \alpha_{i,V(i)}\right] = \frac{\rho(G)}{B^{|S|}} \pm \tau,$$

where $\rho$ is the correlation matrix defined by the SDP solution on these variables, i.e.,
$\rho_{i_1,i_2} = \langle v_{i_1,V(i_1)}, v_{i_2,V(i_2)} \rangle$.

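Proof. The algorithm works as follows. For each edge $e \in E(G)$, pick a standard Gaussian vector
$g_e$, independently. For a vertex $u \in V(G)$, let $E(u) \subseteq E(G)$ denote the set of edges incident on $u$.
For a variable $x_{i,j}$ such that $i \in S$, set

$$\beta_{i,j} = \prod_{e \in E(i)} \langle g_e, v_{i,j} \rangle.$$

Then, set

$$\alpha_{i,j} = \begin{cases} \beta_{i,j}/B & \text{if } |\beta_{i,j}| \le B \\ 0 & \text{otherwise.} \end{cases}$$

Fix $V : S \to [n]$ as in the statement. Now let us analyze $\mathbb{E}\left[\prod_{i \in S} \alpha_{i,V(i)}\right]$. Note that without the
truncation when $|\beta_{i,V(i)}|$ exceeds $B$, the expectation would be exactly equal to (where in the second
step below we use the independence of the Gaussians to move the expectation inside the product)

$$\begin{aligned}
\mathbb{E}\left[\prod_{i \in S} \frac{\beta_{i,V(i)}}{B}\right]
&= \frac{1}{B^{|S|}} \, \mathbb{E}\left[\prod_{i \in S} \prod_{e \in E(i)} \langle g_e, v_{i,V(i)} \rangle\right] \\
&= \frac{1}{B^{|S|}} \prod_{(a,b) = e \in E(G)} \mathbb{E}\left[\langle g_e, v_{a,V(a)} \rangle \langle g_e, v_{b,V(b)} \rangle\right] \\
&= \frac{1}{B^{|S|}} \prod_{(a,b) \in E(G)} \langle v_{a,V(a)}, v_{b,V(b)} \rangle \\
&= \frac{\rho(G)}{B^{|S|}}.
\end{aligned}$$

Thus we want to bound the expectation of $\prod_{i \in S} \beta_{i,V(i)}/B - \prod_{i \in S} \alpha_{i,V(i)}$ by $\tau$. This can be shown to
be of order $m \cdot \exp(-B^{2/m}/2)$, because each $\beta_{i,V(i)}$ is a product of at most $|E(G)| \le m$ independent
Gaussians. Thus setting $B$ of order $(\log 1/\tau)^{m/2}$ we get the desired error bound. □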
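The untruncated part of this calculation can be checked numerically. The sketch below is a Monte Carlo illustration under stated assumptions, not the algorithm of Lemma 4.1 itself: it omits the truncation and the scaling by $B$, fixes one variable per layer, and uses hand-picked unit vectors in $\mathbb{R}^3$ together with a star graph on four layers (degrees 3, 1, 1, 1). All names are hypothetical.

```python
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def round_monomial(vectors, edges, rng):
    """One run of the untruncated rounding: beta_i = prod over edges e at i of <g_e, v_i>."""
    dim = len(next(iter(vectors.values())))
    beta = {i: 1.0 for i in vectors}
    for a, b in edges:
        g = [rng.gauss(0.0, 1.0) for _ in range(dim)]  # fresh standard Gaussian vector per edge
        beta[a] *= dot(g, vectors[a])
        beta[b] *= dot(g, vectors[b])
    return beta

def estimate_monomial(vectors, edges, samples, seed=0):
    """Empirical mean of prod_i beta_i over `samples` independent runs."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        product_of_betas = 1.0
        for value in round_monomial(vectors, edges, rng).values():
            product_of_betas *= value
        total += product_of_betas
    return total / samples

# Assumed example data: unit vectors for one variable in each of four layers.
vectors = {0: (1.0, 0.0, 0.0), 1: (0.6, 0.8, 0.0),
           2: (0.5, 0.0, 0.75 ** 0.5), 3: (-0.8, 0.6, 0.0)}
edges = [(0, 1), (0, 2), (0, 3)]  # odd-degree star graph

rho_G = 1.0
for a, b in edges:
    rho_G *= dot(vectors[a], vectors[b])  # 0.6 * 0.5 * (-0.8), i.e. about -0.24

print(rho_G, estimate_monomial(vectors, edges, 20000))
```

The empirical mean should land close to $\rho(G) \approx -0.24$, with fluctuations shrinking as the number of samples grows; the heavy tails of products of Gaussians are the reason the lemma needs the truncation at $B$.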






4.2 Setup for the Algorithm 



Let $M \le 2^k \cdot k^{2m}$ be the number of odd-degree graphs on some $S \subseteq [k]$ with $\hat{P}(S) \neq 0$ and at most $m$
edges, and let $G_1, \ldots, G_M$ be these graphs. Further write $S_t = V(G_t) \subseteq [k]$ for the vertex set of $G_t$.
For a correlation matrix $\rho \in \mathbb{R}^{k \times k}$, let $q(\rho)$ be the vector $q(\rho) = (\rho(G_1), \rho(G_2), \ldots, \rho(G_M)) \in \mathbb{R}^M$,
and let $Q \subseteq \mathbb{R}^M$ be the convex hull of $\{q(\rho) : \rho \text{ is } P\text{-supported}\}$.

Note that a $\Lambda$ such that $\Lambda(G_i) = 0$ for all $1 \le i \le M$ is precisely a convex combination $\Lambda$ of $\rho$'s
such that $\mathbb{E}_{\rho \sim \Lambda}[q(\rho)] = 0$. In other words, since $P$ does not satisfy the condition of Theorem 1.3, we
have that $Q$ does not contain the origin. Furthermore $Q$ is compact and so we can find a separating
hyperplane $(\gamma_1, \ldots, \gamma_M)$ such that $\sum_{t=1}^M \gamma_t \rho(G_t) \ge \delta$ for every $P$-supported $\rho$ and some universal
constant $\delta > 0$ (depending only on $P$).

Now let $\tau$ be a suitable quantity proportional to $\delta$, and set $B = \mathrm{poly}(\log 1/\tau)$ large enough to make Lemma 4.1 work for all the
graphs $G_1, \ldots, G_M$. In our algorithm, we are going to choose one $t \in [M]$ at random and then the
algorithm is going to focus solely on the terms involving layers $S_t$. More precisely, as we shall see
in the next section, $t$ should be chosen with probability proportional to $\left|\frac{\gamma_t}{\hat{P}(S_t)}\right| \cdot B^{|S_t|}$. In order for
this to make sense, we therefore need that

$$\sum_{t=1}^{M} \left|\frac{\gamma_t}{\hat{P}(S_t)}\right| \cdot B^{|S_t|} \le 1. \qquad (1)$$

Fortunately, we can assume without loss of generality that this holds: since $B^{|S_t|} \le B^k$ depends
sub-linearly (in fact even poly-logarithmically) on $1/\tau$, dividing each $\gamma_t$ by some factor $f > 1$ causes
$\delta$ and $\tau$ to also be divided by $f$, which in turn changes $B^{|S_t|}$ to $\mathrm{polylog}(f/\tau) = o(f B^{|S_t|})$, so that
the sum in the left hand side of (1) decreases by a factor which is super-constant in $f$. Hence
choosing $f$ a sufficiently large constant, we can make (1) hold.



4.3 The Rounding Algorithm 

We are now ready to describe the algorithm. Without loss of generality, we may assume that we
are given a $(1-\epsilon)$-satisfiable Max PartCSP($P$) instance where $\epsilon > 0$ is some sufficiently small
constant (depending on $P$) to be determined later. If the instance is not $(1-\epsilon)$-satisfiable then a
random assignment already gives an approximation ratio of $\frac{|P^{-1}(1)|}{2^k (1-\epsilon)}$.

Since the SDP is a relaxation, its optimum on such an instance is at least $1-\epsilon$.
By Markov's inequality, for at least a $1 - \sqrt{\epsilon}$ fraction of constraints $(T_i, P_i)$ we have
$\mathbb{E}_{x \sim \mu_i}[P_i(x)] \ge 1 - \sqrt{\epsilon}$. In other words, $\mu_i$ has a $1 - \sqrt{\epsilon}$ fraction of its mass on $P_i^{-1}(1)$.

Claim 4.2. Given a correlation matrix $\rho \in \mathbb{R}^{k \times k}$ of a distribution which is $(1-\sqrt{\epsilon})$-supported on
$P^{-1}(1)$, there is a $P$-supported correlation matrix $\rho'$ such that $|\rho(G) - \rho'(G)| \le \sqrt{\epsilon} \, 2^m$ for every $G$
on $m$ edges.

Thus, choosing $\epsilon$ sufficiently small, we have that $\sum_t \gamma_t \rho(G_t) \ge \delta/2$ for all correlation matrices $\rho$ of distributions
which are $(1-\sqrt{\epsilon})$-supported on $P^{-1}(1)$.

Now the rounding algorithm is as in Figure 2.

Now, fix the value of $t$ chosen in step 1, and let $\alpha_{i,j}$ be the rounded value to the variable $x_{i,j}$ as
in Lemma 4.1 and $\alpha'_{i,j}$ be equal to $\alpha_{i,j}$ or its negation after the third step above. The assignment
$\alpha'$ is in $[-1,1]^{n \times k}$ but as the objective function is multilinear it can be greedily adjusted to an
integral assignment in $\{-1,1\}^{n \times k}$ without decreasing the objective value, so it suffices to study $\alpha'$.
Let $S \subseteq [k]$ be some set of layers and $V : S \to [n]$ be any choice of variables from these layers.
1. Pick $t \in [M]$ with probability $\left|\frac{\gamma_t}{\hat{P}(S_t)}\right| \cdot B^{|S_t|}$.

2. Using Lemma 4.1, round the variables in layers in $S_t$ using graph $G_t$. For every other
layer, set all the variables in that layer to 0.

3. If $\mathrm{sign}(\gamma_t \hat{P}(S_t)) = -1$, then select an odd sized subset $A$ of $S_t$ at random, else select an
even sized subset $A$ of $S_t$ at random. Flip the sign of all variables in layers in $A$.

Fig. 2: Rounding algorithm for Max PartCSP($P$)



Observe that for every $S \ne S_t$,

$$\mathbb{E}\Big[\prod_{i \in S} \tilde{\alpha}_{i,V(i)}\Big] = 0.$$

To see this, note that if $S \not\subseteq S_t$, then the variables in layers $S \setminus S_t$ are set to 0. On the other hand if $S \subsetneq S_t$, then we flip the signs of a random set of layers of either odd or even size. As the distribution over which layers get flipped is $(|S_t| - 1)$-wise independent, the layers of any $S \subsetneq S_t$ get flipped completely uniformly.
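The effect of the parity-constrained flip can be checked by brute force on a small layer set. The sketch below assumes that "at random" means uniformly among all subsets of the required parity; the function names are introduced here for illustration only. It averages the sign $\prod_{i \in S} (-1)^{[i \in A]}$ picked up by the layers in $S$.

```python
from itertools import combinations


def subsets_with_parity(layers, odd):
    """All subsets of `layers` whose size is odd (odd=True) or even (odd=False)."""
    layers = sorted(layers)
    want = 1 if odd else 0
    return [
        frozenset(A)
        for r in range(len(layers) + 1)
        if r % 2 == want
        for A in combinations(layers, r)
    ]


def expected_sign(S, S_t, odd):
    """Average of prod_{i in S} (-1)^{[i in A]} over a uniform A of the given parity."""
    S = frozenset(S)
    choices = subsets_with_parity(S_t, odd)
    total = sum((-1) ** len(A & S) for A in choices)
    return total / len(choices)


S_t = {0, 1, 2, 3}
# A nonempty proper subset of layers is flipped uniformly, so its average sign vanishes,
# while the full set S_t picks up exactly the parity of |A|.
proper = expected_sign({0, 2}, S_t, odd=True)
full_odd = expected_sign(S_t, S_t, odd=True)
full_even = expected_sign(S_t, S_t, odd=False)
```

Only the full set $S_t$ retains a deterministic sign, which is why the flip in step 3 controls the sign of the $S = S_t$ term without introducing contributions from proper subsets.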

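On the other hand, by Lemma 4.1 and the way the signs are flipped in the third step, if $S = S_t$ we have

$$\mathbb{E}\Big[\prod_{i \in S} \tilde{\alpha}_{i,V(i)}\Big] = \mathrm{sign}\big(\gamma_t \hat{P}(S_t)\big) \left( \frac{\rho(G_t)}{B^{|S_t|}} \pm \tau \right).$$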



Thus, taking the expectation of $\mathbb{E}\big[\prod_{i \in S} \tilde{\alpha}_{i,V(i)}\big]$ over $t \in [M]$ chosen according to Step 1, we have



E 

t 



E 

t:St=S 

E 

t:S t =S 



It 



P(S t 

It 

Hs t ) 



sign( 7t P(S t )) 



pjG t ) 
B\St\ 



p(Gt) ± r 



Now we can analyze the probability that any specific constraint is satisfied. Let $(T_i, P_i)$ be a constraint involving one variable from each layer which is $(1-\sqrt{\epsilon})$-satisfied by the SDP solution. In other words, $\mathbb{E}_{x \sim \mu_i}[P_i(x)] \ge 1 - \sqrt{\epsilon}$. Write $V : [k] \to [n]$ for the variables involved (i.e., $T_i = \{(i', V(i')) : i' \in [k]\}$) and write $P_i(x) = P(b_1 x_{1,V(1)}, \ldots, b_k x_{k,V(k)})$ for some signs $b_1, \ldots, b_k \in \{-1, 1\}$.

We also associate the domain of $P_i$ with $\{-1,1\}^k$ in the obvious way. As such, it is easy to verify that the Fourier coefficient $\hat{P}_i(S)$ for $S \subseteq [k]$ satisfies $\hat{P}_i(S) = \hat{P}(S)\chi_S(b)$. Furthermore, let $\tilde{\mu}_i$ be the distribution over $\{-1,1\}^k$ obtained by sampling from $\mu_i$ and performing coordinatewise multiplication by $b$, and let $\rho$ (resp. $\tilde{\rho}$) denote the correlation matrix of $\mu_i$ (resp. $\tilde{\mu}_i$). Then, for any graph $G$ with edge set $E$ we have

$$\tilde{\rho}(G) = \prod_{(a,a') \in E} \tilde{\rho}_{a,a'} = \prod_{(a,a') \in E} \rho_{a,a'}\, \chi_{\{a,a'\}}(b) = \rho(G) \cdot \chi_{\mathrm{Odd}(G)}(b),$$

where $\mathrm{Odd}(G)$ denotes the set of odd-degree vertices of $G$. In particular for $G = G_t$ all vertices have odd degree so $\mathrm{Odd}(G_t) = S_t$. Using this and noting that $\tilde{\mu}_i$ is $(1-\sqrt{\epsilon})$-supported on satisfying assignments of $P$ we see that

$$\sum_t \gamma_t\, \rho(G_t)\, \chi_{S_t}(b) = \sum_t \gamma_t\, \tilde{\rho}(G_t) \ge \delta/2.$$
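The identity $\tilde{\rho}(G) = \rho(G) \cdot \chi_{\mathrm{Odd}(G)}(b)$ can be sanity-checked numerically. The sketch below is an illustration under assumptions: a distribution is represented as a hypothetical dictionary from points of $\{-1,1\}^k$ to probabilities, a (multi)graph as a list of edges, and all function names are introduced here.

```python
from math import prod


def correlation(dist, a, b):
    """E[x_a * x_b] for dist: dict mapping tuples in {-1,1}^k to probabilities."""
    return sum(p * x[a] * x[b] for x, p in dist.items())


def rho_of_graph(dist, edges):
    """rho(G): product of the pairwise correlations over the edges of the multigraph."""
    return prod(correlation(dist, a, b) for a, b in edges)


def twist(dist, signs):
    """Distribution of the coordinatewise product of a sample from dist with `signs`."""
    out = {}
    for x, p in dist.items():
        y = tuple(s * xi for s, xi in zip(signs, x))
        out[y] = out.get(y, 0.0) + p
    return out


def chi_odd(edges, signs):
    """chi_{Odd(G)}(b): product of the signs over the odd-degree vertices of G."""
    degree = {}
    for a, b in edges:
        degree[a] = degree.get(a, 0) + 1
        degree[b] = degree.get(b, 0) + 1
    return prod(signs[v] for v, d in degree.items() if d % 2 == 1)


# Hypothetical example on k = 3 coordinates with a repeated edge.
mu = {(1, 1, 1): 0.5, (1, -1, -1): 0.2, (-1, 1, -1): 0.2, (-1, -1, 1): 0.1}
edges = [(0, 1), (1, 2), (0, 1)]
b = (-1, 1, -1)
lhs = rho_of_graph(twist(mu, b), edges)
rhs = rho_of_graph(mu, edges) * chi_odd(edges, b)
```

Each edge $(a,a')$ contributes a factor $b_a b_{a'}$, so a vertex of even degree contributes $b_a^2 = 1$ overall and only the odd-degree vertices survive.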

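We then have the following, where the expectation below is taken over all the random choices of the algorithm (including that of $t \in [M]$).

$$\begin{aligned}
\mathbb{E}\big[P_i(\tilde{\alpha}_{1,V(1)}, \ldots, \tilde{\alpha}_{k,V(k)})\big]
&= \hat{P}_i(\emptyset) + \sum_{\emptyset \ne S \subseteq [k]} \hat{P}_i(S)\, \mathbb{E}\Big[\prod_{i \in S} \tilde{\alpha}_{i,V(i)}\Big] \\
&= \hat{P}(\emptyset) + \sum_{\substack{\emptyset \ne S \subseteq [k] \\ \hat{P}(S) \ne 0}} \hat{P}(S)\chi_S(b) \sum_{t : S_t = S} \frac{\gamma_t}{\hat{P}(S_t)} \big(\rho(G_t) \pm \tau\big) \\
&= \hat{P}(\emptyset) + \sum_{t=1}^{M} \gamma_t\, \rho(G_t)\, \chi_{S_t}(b) \pm \tau M \;\ge\; \hat{P}(\emptyset) + \frac{\delta}{4}.
\end{aligned}$$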
Thus the total fraction of constraints satisfied by the algorithm is in expectation at least $(1 - \sqrt{\epsilon})(\hat{P}(\emptyset) + \delta/4)$, which is at least $\hat{P}(\emptyset) + \delta/8$ assuming $\epsilon < (\delta/8)^2$.
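The last inequality can be verified directly. The only extra fact used in this sketch is that $\hat{P}(\emptyset) + \delta/4 \le 1$, which holds because this quantity lower-bounds the expected value of a predicate taking values in $[0,1]$.

```latex
(1-\sqrt{\epsilon})\Big(\hat{P}(\emptyset) + \tfrac{\delta}{4}\Big)
= \hat{P}(\emptyset) + \tfrac{\delta}{4} - \sqrt{\epsilon}\,\Big(\hat{P}(\emptyset) + \tfrac{\delta}{4}\Big)
\;\ge\; \hat{P}(\emptyset) + \tfrac{\delta}{4} - \sqrt{\epsilon}
\;\ge\; \hat{P}(\emptyset) + \tfrac{\delta}{4} - \tfrac{\delta}{8}
= \hat{P}(\emptyset) + \tfrac{\delta}{8},
```

where the second inequality uses $\sqrt{\epsilon} < \delta/8$.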

In other words, the algorithm finds a $(\hat{P}(\emptyset) + \delta/8)$-approximate solution on all instances with value at least $1 - \epsilon$. Combining this with a random assignment gives an approximation better than $\hat{P}(\emptyset)$ for any instance, and concludes the proof of approximability of Max PartCSP(P).

5 Hardness 

In this section we show that any $P$ which satisfies the condition of Theorem 1.3 is approximation resistant, assuming the UGC. As usual, we prove hardness by designing an appropriate dictatorship test, which is given in Section 5.1, followed by the (standard) hardness reduction in Section 5.2.

5.1 Dictatorship Test 

Theorem 5.1. Let $P$ satisfy the condition of Theorem 1.3. Then for every $k$ and $\epsilon > 0$ there exists a $\delta > 0$ such that the following holds for all $n$.

There is a randomized algorithm $T$ which, given oracle access to $k$ odd functions $f_1, \ldots, f_k : \{-1,1\}^n \to [-1,1]$, produces $k$ queries $X_1, \ldots, X_k \in \{-1,1\}^n$ such that

(Yes) If $f_1(x) = f_2(x) = \ldots = f_k(x) = x_i$ are the same dictator function, then $\mathbb{E}[P(f_1(X_1), \ldots, f_k(X_k))] \ge 1 - \epsilon$.

(No) If all $f_i$'s have $\mathrm{Inf}_j^{\le 1/\delta}(f_i) \le \delta$ for all $j \in [n]$, then $\mathbb{E}[P(f_1(X_1), \ldots, f_k(X_k))] \le \hat{P}(\emptyset) + \epsilon$.

Let $m = k \cdot n$ and let $\Lambda$ be a distribution over $P$-supported correlation matrices which is $m$-vanishing on $P$. Let $U_k$ be the uniform distribution over $\{-1,1\}^k$. The tester $T$ is described in Figure 3.

That the completeness is $1 - \epsilon$ follows immediately from $\mathrm{supp}(\eta) \subseteq P^{-1}(1)$.

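Let us then analyze the soundness. The acceptance probability can be written as

$$\Pr[T \text{ accepts}] = \mathbb{E}_{\rho}\Big[\mathbb{E}_{X \sim \mu}\big[P(f_1(X_1), \ldots, f_k(X_k))\big]\Big] = \hat{P}(\emptyset) + \mathbb{E}_{\rho}\Big[\sum_{\emptyset \ne S \subseteq [k]} \hat{P}(S)\, \mathbb{E}_{X}\Big[\prod_{i \in S} f_i(X_i)\Big]\Big]. \qquad (2)$$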
Input: functions $f_1, \ldots, f_k : \{-1,1\}^n \to [-1,1]$
Output: accept/reject

1. Pick a random $\rho \sim \Lambda$.

2. Let $\eta$ be an unbiased $P$-supported distribution with correlation matrix $\rho$ (if there are many such $\eta$, pick an arbitrary one in a deterministic fashion).

3. Let $\mu = (1 - \epsilon)\eta + \epsilon U_k$.

4. Pick a random $k \times n$ matrix $X$ where each column is sampled independently according to $\mu$.

5. Accept with probability $P(f_1(X_1), \ldots, f_k(X_k))$.

Fig. 3: Dictatorship Test
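Steps 3 to 5 of the test can be sketched in code. This is a minimal illustration under stated assumptions: steps 1 and 2 are skipped, so the distribution $\eta$ is passed in explicitly as a hypothetical list of (point, probability) pairs, and the names `sample_queries` and `accept_probability` are introduced here. The rows of the sampled matrix play the role of the queries $X_1, \ldots, X_k$.

```python
import random


def sample_queries(eta, eps, n, rng):
    """Sample the k x n matrix X column by column from mu = (1 - eps) * eta + eps * U_k.

    eta: list of (point in {-1,1}^k, probability) pairs.
    Returns the k rows X_1, ..., X_k, each a tuple in {-1,1}^n.
    """
    points = [x for x, _ in eta]
    weights = [p for _, p in eta]
    k = len(points[0])
    columns = []
    for _ in range(n):
        if rng.random() < eps:
            # Noise component: a uniformly random column from U_k.
            col = tuple(rng.choice((-1, 1)) for _ in range(k))
        else:
            # Structured component: a column drawn from eta.
            col = rng.choices(points, weights=weights)[0]
        columns.append(col)
    return [tuple(col[i] for col in columns) for i in range(k)]


def accept_probability(P, fs, X):
    """Step 5: the value P(f_1(X_1), ..., f_k(X_k)) for already sampled queries."""
    return P(tuple(f(row) for f, row in zip(fs, X)))


# Hypothetical usage: eta uniform on the even-parity points of {-1,1}^3.
eta = [((1, 1, 1), 0.25), ((1, -1, -1), 0.25), ((-1, 1, -1), 0.25), ((-1, -1, 1), 0.25)]
parity = lambda z: 1.0 if z[0] * z[1] * z[2] == 1 else 0.0
X = sample_queries(eta, 0.1, 8, random.Random(1))
value = accept_probability(parity, [lambda x: x[0]] * 3, X)
```

When all $f_i$ are the same dictator $x_j$, the tuple fed to $P$ is exactly column $j$ of $X$, which lies in the support of $\eta$ except with probability at most $\epsilon$; this is the completeness argument in miniature.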



Let $\rho'$ be the correlation matrix of $\mu$. Note that $\rho' = (1 - \epsilon)\rho + \epsilon I$. Fix the value of $\rho$ for the moment, and let $G$ be a random $k \times n$ matrix of standard Gaussians with the same covariances as $X$ (i.e., the columns are independent and in the $j$'th column we have $\mathbb{E}[G_{i,j} G_{i',j}] = \rho'_{i,i'} = (1 - \epsilon)\rho_{i,i'}$ for $i \ne i'$).
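The identity for $\rho'$ follows from linearity of expectation over the mixture defining $\mu$. A short derivation, using only that distinct coordinates of $U_k$ are independent and unbiased:

```latex
\rho'_{i,i'}
= \mathbb{E}_{x \sim \mu}[x_i x_{i'}]
= (1-\epsilon)\,\mathbb{E}_{x \sim \eta}[x_i x_{i'}] + \epsilon\,\mathbb{E}_{x \sim U_k}[x_i x_{i'}]
= (1-\epsilon)\,\rho_{i,i'} + \epsilon \cdot 0
\quad (i \ne i'),
\qquad
\rho'_{i,i} = 1 = (1-\epsilon)\cdot 1 + \epsilon \cdot 1 .
```

Together the two cases give $\rho' = (1-\epsilon)\rho + \epsilon I$.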

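Next, set $\delta$ small enough so that Theorem 2.6 gives that if $\mathrm{Inf}_j^{\le 1/\delta}(f_i) \le \delta$ for all $i \in [k]$, $j \in [n]$ then

$$\left| \mathbb{E}_{X}\Big[\prod_{i \in S} f_i(X_i)\Big] - \mathbb{E}_{G}\Big[\prod_{i \in S} f_i^{\le 1/\delta}(G_i)\Big] \right| \le \epsilon / 2^k \qquad (3)$$

for all $S$. Define $f'_i = f_i^{\le 1/\delta}$. We need to understand expressions of the form $\mathbb{E}_G\big[\prod_{i \in S} f'_i(G_i)\big]$ for $S \subseteq [k]$. Expanding $f'_i = \sum_{T \subseteq [n]} \hat{f}'_i(T)\chi_T$ and applying Lemma 2.7 we obtain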



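$$\begin{aligned}
\mathbb{E}_{G}\Big[\prod_{i \in S} f'_i(G_i)\Big]
&= \sum_{\{T_i\}_{i \in S}} \prod_{i \in S} \hat{f}'_i(T_i) \prod_{j=1}^{n} \mathbb{E}\Big[\prod_{i : j \in T_i} G_{i,j}\Big] \\
&= \sum_{\{T_i\}_{i \in S}} \prod_{i \in S} \hat{f}'_i(T_i) \prod_{j=1}^{n} \sum_{M_j \in \mathcal{M}(\{i : j \in T_i\})} \prod_{(i,i') \in M_j} \rho'_{i,i'} \qquad (4)
\end{aligned}$$

where we write $\mathcal{M}(S)$ for the set of perfect matchings on the complete graph with vertex set $S$.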
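The sum over perfect matchings in (4) is the shape of the standard Wick (Isserlis) formula for moments of centered jointly Gaussian variables. Assuming that this is what Lemma 2.7 provides, the following sketch enumerates perfect matchings and evaluates such a moment; the function names and the covariance matrix are illustrative only.

```python
def perfect_matchings(items):
    """Yield every perfect matching of `items` as a list of pairs (none if the size is odd)."""
    items = list(items)
    if not items:
        yield []
        return
    if len(items) % 2 == 1:
        return
    first, rest = items[0], items[1:]
    for idx, partner in enumerate(rest):
        remaining = rest[:idx] + rest[idx + 1:]
        for matching in perfect_matchings(remaining):
            yield [(first, partner)] + matching


def gaussian_moment(indices, cov):
    """E[prod_{i in indices} g_i] for centered jointly Gaussian g with covariance matrix cov.

    Wick/Isserlis: sum over perfect matchings of the product of pairwise covariances.
    Repeated indices are allowed and denote powers of the same variable.
    """
    total = 0.0
    for matching in perfect_matchings(indices):
        term = 1.0
        for a, b in matching:
            term *= cov[a][b]
        total += term
    return total


# Hypothetical 2 x 2 covariance matrix with unit variances and correlation 0.5.
cov = [[1.0, 0.5], [0.5, 1.0]]
fourth_moment = gaussian_moment([0, 0, 0, 0], cov)
mixed_moment = gaussian_moment([0, 0, 1, 1], cov)
```

In (4) the formula is applied separately in each column $j$ to the rows $\{i : j \in T_i\}$, which is why the moment vanishes unless that set has even size.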

For a choice $T = \{T_i\}_{i \in S}$ of $T_i$'s, let $c(T) = \prod_{i \in S} \hat{f}'_i(T_i)$. Further, for a choice of matchings $M = (M_1, \ldots, M_n)$ let $H(M)$ denote the multigraph being the union of $M_1, \ldots, M_n$. With a slight abuse of notation, write $\mathcal{M}(T)$ for the set of $M$'s for a given $T$; i.e., $\mathcal{M}(T) = \{(M_1, \ldots, M_n) : M_j \in \mathcal{M}(\{i : j \in T_i\})\}$. With all this cumbersome notation in place, the equation above simplifies to

$$(4) = \sum_{T} \sum_{M \in \mathcal{M}(T)} c(T)\, \rho'(H(M)), \qquad (5)$$

where $T$ ranges over all $\{T_i \subseteq [n]\}_{i \in S}$. Note that since $f_i$ (and therefore also $f'_i$) is odd, we can restrict the sum to $T$ such that each $|T_i|$ is odd, implying that every vertex of $H(M)$ has odd degree (vertex $i$ has degree $|T_i|$).
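Plugging (5) into (3), we have

$$\left| \mathbb{E}_{X}\Big[\prod_{i \in S} f_i(X_i)\Big] - \sum_{T} \sum_{M \in \mathcal{M}(T)} c(T)\, \rho'(H(M)) \right| \le \epsilon / 2^k.$$

Finally, plugging this into (2) and using the identity $\rho' = (1 - \epsilon)\rho + \epsilon I$ yields

$$\begin{aligned}
\Pr[T \text{ accepts}]
&\le \hat{P}(\emptyset) + \mathbb{E}_{\rho}\Big[\sum_{S} \hat{P}(S) \Big( \sum_{T} \sum_{M \in \mathcal{M}(T)} c(T)\, \rho'(H(M)) + \epsilon/2^k \Big)\Big] \\
&\le \hat{P}(\emptyset) + \epsilon + \sum_{S} \hat{P}(S) \sum_{T} \sum_{M \in \mathcal{M}(T)} c(T)\, (1-\epsilon)^{|E(H(M))|}\, \mathbb{E}_{\rho}\big[\rho(H(M))\big] \\
&= \hat{P}(\emptyset) + \epsilon,
\end{aligned}$$

where the last equality follows by the $m$-vanishing property of $\Lambda$: $H(M)$ has at most $n|S|/2 \le nk = m$ edges, and so for each $S$ either $\hat{P}(S) = 0$ or $\mathbb{E}_{\rho}[\rho(H(M))] = \Lambda(H(M)) = 0$.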

5.2 Hardness Reduction 

Given the dictatorship test as in Theorem 5.1, a UGC-based hardness reduction can be designed in a standard manner. Some care needs to be taken, however, to ensure that the CSP instance produced by the reduction is $k$-partite. As is standard, we present the reduction as a Probabilistically Checkable Proof (PCP) for NP whose acceptance predicate matches the predicate $P$, and which has completeness $1 - o(1)$ and soundness $\frac{|P^{-1}(1)|}{2^k} + o(1)$.

The PCP is based on the conjectured NP-hard instance $A = (U, V, E, \Pi, [L])$ of Unique Games as defined in Section 2. Let $L$ and $\gamma$ be as in Conjecture 2.3. The PCP proof consists of $k$ layers where the bits in the $i$'th layer correspond to $V_i \times \{-1,1\}^L$ and $V_i$ is a copy of the "right hand side" $V$ of the UG instance. For any $v_i \in V_i (= V)$, the set of bits $\{v_i\} \times \{-1,1\}^L$ corresponds to the bits of the long code of the label of $v_i$. In a "correct" proof, the assignment to these bits corresponds to a dictatorship function $f(x) = x_j$ where $j \in [L]$ is the intended label of $v_i$.

For a function $g : \{-1,1\}^L \to \{-1,1\}$ and a permutation $\pi : [L] \to [L]$, let $g \circ \pi^{-1} : \{-1,1\}^L \to \{-1,1\}$ denote the function defined as $g \circ \pi^{-1}(x) = g(x_{\pi^{-1}(1)}, \ldots, x_{\pi^{-1}(L)})$. The PCP verifier proceeds as in Figure 4.



1. Pick a random vertex $u \in U$.

2. Pick $k$ random neighbors of $u$, namely $v_1, \ldots, v_k \in V$.

3. Let $g_1, \ldots, g_k$ be the functions (supposed long codes) for $v_1 \in V_1, \ldots, v_k \in V_k$ respectively.

4. Let $f_1, \ldots, f_k$ be the permuted versions of $g_1, \ldots, g_k$ respectively, i.e., $f_i = g_i \circ \pi_i^{-1}$, where $\pi_i = \pi_{e_i}$ and $e_i = (u, v_i)$, for $1 \le i \le k$.

5. Run the dictatorship test as in Theorem 5.1 on $f_1, \ldots, f_k$.

Fig. 4: PCP Verifier
5.2.1 Completeness

Let $\ell : U \cup V \to [L]$ be a labeling of the UG instance that satisfies a $1 - \gamma$ fraction of its edges. For every $v_i \in V_i (= V)$, let $g_i$ be the long code of $\ell(v_i)$, i.e. $g_i(x) = x_{\ell(v_i)}$. With probability at least $1 - k\gamma$, all $k$ edges $(u, v_1), \ldots, (u, v_k)$ are satisfied by the labeling, and whenever this holds, the dictatorship test accepts with probability at least $1 - \epsilon$. The latter conclusion follows by observing that if $\pi_i(\ell(u)) = \ell(v_i)$ for every $1 \le i \le k$, then in the PCP test above,

$$f_i(x) = g_i \circ \pi_i^{-1}(x) = g_i(x_{\pi_i^{-1}(1)}, \ldots, x_{\pi_i^{-1}(L)}) = x_{\pi_i^{-1}(\ell(v_i))} = x_{\ell(u)},$$

and hence $f_1, \ldots, f_k$ are identical dictatorship functions.
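The folding of a long code through the edge permutation can be illustrated concretely. In the sketch below, labels are 0-indexed, a permutation is a Python list with `pi[a]` the image of `a`, and the names `long_code` and `compose_with_inverse` are hypothetical helpers introduced for this example.

```python
from itertools import product


def long_code(label):
    """The dictator function x -> x[label]."""
    return lambda x: x[label]


def compose_with_inverse(g, pi):
    """Return g o pi^{-1}, i.e. x -> g(x[pi^{-1}(0)], ..., x[pi^{-1}(L-1)])."""
    L = len(pi)
    inverse = [0] * L
    for a, image in enumerate(pi):
        inverse[image] = a
    return lambda x: g(tuple(x[inverse[j]] for j in range(L)))


# Hypothetical edge (u, v) with permutation pi and a labeling that satisfies it.
pi = [2, 0, 3, 1]
label_u = 1
label_v = pi[label_u]
f = compose_with_inverse(long_code(label_v), pi)
agrees = all(f(x) == x[label_u] for x in product((-1, 1), repeat=len(pi)))
```

If the edge is satisfied, the permuted long code of the label of $v$ is the dictator of the label of $u$, so all $k$ functions handed to the dictatorship test coincide.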



5.2.2 Soundness 

Assume that the soundness of the UG instance is at most $\gamma$, which is chosen to be sufficiently small beforehand. Fix any layer $i$ in the PCP proof and let $g_{i,v}$ be the supposed long code corresponding to the vertex $v$ (in the $i$'th layer). For any $u \in U$, define the function which is the average of these functions over the neighbors of $u$ after appropriate permutation:

$$f_{i,u}(x) = \mathbb{E}_{v : (u,v) \in E}\Big[ g_{i,v} \circ \pi_{(u,v)}^{-1}(x) \Big].$$

Note that the $f_{i,u}$ are $[-1,1]$-valued. By a standard argument, we may assume that for all but a $\sqrt{\gamma}$ fraction of $u \in U$, the function $f_{i,u}$ has no coordinate with degree-$1/\delta$ influence at least $\delta$ (referred to as a low-influence function for brevity).

Otherwise, suppose that for a $\sqrt{\gamma}$ fraction of $u$, $f_{i,u}$ has a coordinate with degree-$1/\delta$ influence at least $\delta$. For brevity, call any such coordinate an influential coordinate. The set of all influential coordinates has size bounded by $1/\delta^2$. Assign this bounded set as the set of candidate labels for $u$. For any influential coordinate $j \in [L]$, since $f_{i,u}$ is an average of $g_{i,v} \circ \pi_{(u,v)}^{-1}$ over the neighbors of $u$, by an averaging argument, for at least a $\delta/2$ fraction of the neighbors, $\pi_{(u,v)}(j)$ is influential for $g_{i,v}$. All influential coordinates of $g_{i,v}$ are assigned as the candidate labels for $v$. Now define a (randomized) labeling that selects one label at random from the candidate set of each vertex. The argument sketched implies that this labeling satisfies a $\sqrt{\gamma} \cdot \delta/2 \cdot \delta^4$ fraction of the UG edges. This is a contradiction if the soundness $\gamma$ was chosen to be sufficiently small to begin with.

Hence except with probability $k\sqrt{\gamma}$, the PCP verifier chooses $u \in U$ such that the $k$ functions $f_{i,u}$, one in each layer, are all low-influence functions. Whenever this holds, the analysis of the dictatorship test implies that the verifier accepts with probability at most $\frac{|P^{-1}(1)|}{2^k} + \epsilon$. One only needs to observe that for a fixed $u$, the verifier picks its random neighbor in each layer independently, and hence running the test on these random neighbors (one in each layer) has the same effect as running the test on the (possibly non-Boolean) averaged functions (again, one in each layer). Formally, fixing $u$,

$$\begin{aligned}
\mathbb{E}_{v_1, \ldots, v_k}\Big[ \mathbb{E}_{X}\big[ P\big(g_{1,v_1} \circ \pi_1^{-1}(X_1), \ldots, g_{k,v_k} \circ \pi_k^{-1}(X_k)\big) \big] \Big]
&= \mathbb{E}_{X}\Big[ P\big( \mathbb{E}_{v_1}[g_{1,v_1} \circ \pi_1^{-1}(X_1)], \ldots, \mathbb{E}_{v_k}[g_{k,v_k} \circ \pi_k^{-1}(X_k)] \big) \Big] \\
&= \mathbb{E}_{X}\big[ P(f_{1,u}(X_1), \ldots, f_{k,u}(X_k)) \big].
\end{aligned}$$